Complex organic matter degradation by secondary consumers in chemolithoautotrophy-based subsurface geothermal ecosystems

Microbial communities in terrestrial geothermal systems often contain chemolithoautotrophs with well-characterized distributions and metabolic capabilities. However, the extent to which organic matter produced by these chemolithoautotrophs supports heterotrophs remains largely unknown. Here we compared the abundance and activity of peptidases and carbohydrate active enzymes (CAZymes) that are predicted to be extracellular identified in metagenomic assemblies from 63 springs in the Central American and the Andean convergent margin (Argentinian backarc of the Central Volcanic Zone), as well as the plume-influenced spreading center in Iceland. All assemblies contain two orders of magnitude more peptidases than CAZymes, suggesting that the microorganisms more often use proteins for their carbon and/or nitrogen acquisition instead of complex sugars. The CAZy families in highest abundance are GH23 and CBM50, and the most abundant peptidase families are M23 and C26, all four of which degrade peptidoglycan found in bacterial cells. This implies that the heterotrophic community relies on autochthonous dead cell biomass, rather than allochthonous plant matter, for organic material. Enzymes involved in the degradation of cyanobacterial- and algal-derived compounds are in lower abundance at every site, with volcanic sites having more enzymes degrading cyanobacterial compounds and non-volcanic sites having more enzymes degrading algal compounds. Activity assays showed that many of these enzyme classes are active in these samples. High temperature sites (> 80°C) had similar extracellular carbon-degrading enzymes regardless of their province, suggesting a less well-developed population of secondary consumers at these sites, possibly connected with the limited extent of the subsurface biosphere in these high temperature sites. We conclude that in < 80°C springs, chemolithoautotrophic production supports heterotrophs capable of degrading a wide range of organic compounds that do not vary by geological province, even though the taxonomic and respiratory repertoire of chemolithoautotrophs and heterotrophs differ greatly across these regions.


Introduction
Terrestrial geothermal systems emit volatiles from the Earth's interior (i.e., mantle and crust) to the atmosphere. Often, meteoric water permeates into the subsurface hydrothermal system, where it is heated and rises to the surface, bringing with it volatiles from the deep subsurface [1]. The differential enrichment of these volatiles into geothermal fluids creates environmental niches that can be saturated with deeply-derived inorganic carbon and other compounds [2][3][4]. In convergent margins such as those of the Central American and the Andean Central Volcanic Zone, inorganic carbon is derived from the mantle, overlying crust and/or down-going slab [5]. In divergent spreading centers and/or areas with mantle plume-influenced volcanism, such as Iceland, geothermal systems are often dominated by deep mantle gases (e.g., Harðardóttir et al. 2018 [6]). These geological systems create a large diversity of surface-emitting springs that range in temperature, pH, inorganic carbon content, and availability of redox active compounds that together make the driving force of microbial community composition [4,7,8]. Sampling fluid emissions from natural surface springs provides access to these deeply-sourced microbial communities and the volatiles that support them [8][9][10][11].
The important role that chemolithoautotrophs play in these geothermal ecosystems is wellestablished (e.g., [4,[12][13][14]). The heterotrophic communities within these systems are less often studied, even though heterotrophs have been shown to be dominant within heavily-sedimented subsurface ecosystems (e.g., [15]). These heavily-sedimented systems do not have a constant supply of redox active volatiles and are therefore dependent on allocthonouslyderived organic matter, and likely differ greatly from geothermal ecosystems. Recent work has focused on understanding the heterotrophic community within terrestrial geothermal systems [16,17] and many industrially useful carbohydrate-and peptide-degrading enzymes have been isolated from these microbial communities [18,19]. Specifically, carbohydrate active enzymes and peptidases have been found in hot spring fluids [17,20,21]. However, a survey of the full complement of all the carboyhdrate active enzymes and peptidases have not been made from metagenomes from hot springs, to our knowledge. The taxonomy and respiratory pathways of primary producers and heterotrophs are known to vary along geological gradients according to changes in deep volatile delivery [4,7,8]. However, it is not known whether organic carbon degradation pathways vary along with them. Different types of chemolithoautotrophs may promote different compositions of carbon-degrading enzymes in the ecosystems they support. Thus, it is important to investigate the heterotrophic community because it actively participates in the geochemical cycle of terrestrial geothermal environments by consuming organic carbon and releasing inorganic carbon.
Hot springs typically contain little photosynthetically-derived organic matter, potentially leading heterotrophs to depend primarily on the byproducts of the chemolithoautotrophic community [22]. To access these organic byproducts, heterotrophs use extracellular enzymes to break down larger organic molecules into smaller molecules that can permeate their cell membrane more readily [23] ultimately recycling carbon through this autotroph-heterotroph mutualism. The composition of carbon-degrading enzymes may therefore show whether chemolithoautotrophy or photosynthesis is more important for heterotrophic communities. Here, we focus on two broad classes of extracellular enzymes: peptidases which break down proteins and carbohydrate-active enzymes (CAZymes) which break down polysaccharides and related macromolecules.
We compare springs across the Central American and the Andean (Argentinian backarc of the Central Volcanic Zone) convergent margins as well as the plume-influenced Iceland plate boundary. These regions are defined by their position across the convergent margin or continental intra-plate setting toward an oceanic plate boundary. The Costa Rican subduction zone is driven by the Cocos-Nazca plate subducting under the Caribbean plate. Northern Costa Rica is characterized by having higher volcanic activity than the other areas [24]. The Panama slab window is a result of a tear within the Nazca plate where arc volcanism ceases [25]. The formation of the slab window is also responsible for the cease in volcanism within the Cordillera Talamanca region [26], and a change in the rock chemistry in the area that shows hot spot-like compositions [27]. The Andean convergent margin is driven by the Nazca plate subducting under the South American plate [28], whereas Iceland is associated with spreading along the Mid-Atlantic Ridge under the influence of a mantle plume (e.g., [2,29,30]). Environmental factors also vary significantly between the sites due to large variations in latitude, elevation, and rainfall. This wide range of geological and environmental settings provides an opportunity to study geochemically diverse springs. Sites from Argentina and Iceland were placed into their own categories because they have different tectonic processes than those of Central America. The sites are then color coded based on these different geological processes that distinguish them (Fig 1).
Enzyme assays and bioinformatics analyses on metagenomes were used to analyze the microbial community interactions within diverse geothermal systems. To study the heterotrophic activity within geothermal systems, assemblies from the Costa Rican convergent margin, the Argentina backarc of the Andean convergent margin and the subaerial section of the Mid-Atlantic ridge (Iceland) were annotated using dbcan2 database and the run_dbcan package to find CAZymes and DRAM to annotate the MEROPS families [32,33]. MEROPS classifies proteolytic enzymes using hierarchical classification by homologous sequences [34]. The CAZyme database splits carbohydrate-active enzymes into classes of glycoside hydrolases (GH), glycosyl transferases (GT), polysaccharide lyases (PL), carbohydrate esterases (CE), auxiliary activities (AA), and carbohydrate binding molecules (CBM), defined by sequence similarity [35]. Enzyme commission numbers are based on the reactions catalyzed instead of sequence homology [36]. Using these different tools allows for a broad analysis of the potential organic matter degrading functions of proteins based on their sequence homology. Hierarchical clustering and principal coordinate analysis were used to find correlations between the sites and the enzyme families found within them. By combining the maximum potential enzymatic activity, measured by low molecular mass fluorogenic substrate proxies, with the metagenomic annotations, we can see a larger scope of the potential heterotrophic activity within these sites.

Sampling
DNA extraction and sequencing for all samples has been previously described [3,4] Sample collection was performed following the protocols previously described [3,4,7] and using the rationale described by Giovannelli et al. 2022 [11]. Sites are color coded based on province (Fig 1) [31]. GPS coordinates, site names, temperature and pH measurements are shown in Table 1. Coauthors located at the University of Salta, Observatorio Volcanológico y Sismológico de Costa Rica (OVSICORI) Universidad Nacional, and NordVulk, Institute of Earth Sciences, University of Iceland, regularly visit these sites and required local verbal permission   from landowners for each one of them. The samples we report here have been published previously [3,4,7,37]. From each site, temperature and pH were measured directly in the fluids using a portable YSI Plus 6-Series Sonde Multimeter (YSI Incorporated, Yellow springs, OH) and 0.5 to 1.5 liters of hydrothermal fluids venting from the subsurface were collected. Care was taken when collecting fluids to do this as close to the perceived fluid source as possible. Fluids were immediately filtered through Sterivex 0.22 μm filter cartridges (MilliporeSigma) and quick-frozen onsite in a liquid nitrogen-cooled dry shipper. When fluid sampling was complete and to avoid resuspension,~10 mL of surface sediments constantly overwashed by the venting source were placed into a sterile plastic vial and frozen onsite along with the filters. Sample names ending in (F) are from filtered fluids and those ending in (S) are from surface sediments.

Bioinformatic processing
For the Iceland metagenomes, raw reads were trimmed with Trimmomatic (v 0.39) [38] and assembled using the MetaWRAP (v 1.3.2) pipeline [39]. The quality of the Iceland assemblies was determined using Quast (v4.4) [40] on Kbase [41]). All other assemblies were generated by trimming raw reads with Trimmomatic (v 0.38). Reads were assembled de novo with metaS-PAdes with a minimum contig length of 1.5 kb [7]. Reads from LC19F, LC19S, RF19S, and TM19S were assembled using MEGAHIT (v 1.2.9) [42]. The assemblies were annotated using prokka (v 1.14.5) [43] and dbcan2 [32]. For peptidase annotations the assemblies were uploaded to KBase [41] and annotated with DRAM (v.0.1.0) [33]. Secreted proteins were identified using SignalP (v 2.0) [44]. The clean reads were mapped back to the assemblies for read coverage using bowtie2 (v 2.3.5.1) and samtools (v1.15.1) [45,46]. Hierarchical clustering based on spearman correlation, was performed using hclust from the base R package (R version 4.2.1). Total microbial community analysis of these sites is the focus of previous work, therefore, taxonomic identification was only performed for contigs for the high temperature sites HV121S, HV221S, and KR21S using gottcha2 on Kbase (S1 Fig in S1 File) [4,7,[47][48][49]. CAZy family abundances were annotated using dbcan2. Annotations were selected if they were annotated with at least two tools: HMMER and DIAMOND. The annotated gene IDs were then combined with the prokka gene ID and contigs to gain the abundance of each contig. Read coverage was calculated as previously described [7] Then the read coverage of each annotation normalized to the total assembly size was used to generate a heatmap with hierarchical clustering of the CAZyme families and site locations. All annotations that are presented also were annotated for having a signal peptide sequence by SignalP [44]. Protein annotations that are not present within 75% of the assemblies were removed for better visualization. Enzyme commission numbers were assigned with dbcan2 database and combined with the SignalP annotations to estimate secretion. Enzyme commission groupings are only shown for class 3 hydrolases where the enzyme commission number was present in at least 75% of the assemblies. MEROPS peptidase annotations were completed using DRAM on Kbase and annotated using SignalP. The gene IDs were then matched to contigs of the assemblies to get the normalized read coverage.

Enzymatic assays
Enzymatic assays were performed following the methods of Bell (2013) [50] with slight modifications. Briefly, sediments were weighed out at 2.75 grams wet weight. The sediments were then combined with 91 mL of 0.5 M Tris-HCl buffer with a matching pH to the original site locations. The sediment slurries were blended for one minute to homogenize them. Then 800 μL of each slurry was pipetted into deep well plates in duplicate. 200 μL of substrates with a concentration of 200 μM were then added to each well. The substrates used were 4-Methy- . Each deep well plate was incubated for 3 hours at 30-70˚C, and time points were taken at 0 hours, 1.5 hours, and 3 hours. To take each time point, 200 μL was pipetted from the deep well plate to a black flat bottom 96 well plate. The fluorescence was measured at 455 nm after excitation at 355 nm using a Tecan Infinite M200 Pro Fluorimeter. Differences in enzyme activities among provinces were tested using a Kruskal-Wallis test, implemented in R, due to the strong non-normal distribution of activities in the data set.

Sites
Three different geographical areas were analyzed: the Central American and Andean convergent margins, and Iceland (mantle plume-influenced spreading center). Large variations in fluid sources (i.e. mantle, slab, crust, or surficial) are expected due to the contrasting geologic and environmental settings of the studied areas. We include data from 22 sites in Costa Rica that are influenced by the convergent margin, with analyses of the geochemistry, respirations, and taxonomic identities of the 22 sites published previously [3,4,7,51]. We include 47 additional sites spanning Costa Rica and Panama that are also influenced by the convergent margin, 13 sites from the Andean convergent margin and 3 sites from Iceland. Sites were grouped by geographic-tectonic setting: Costa Rica outer forearc, Costa Rica active volcanic arc, Costa Rica backarc, Cordillera Talamanca, Panama, Argentina backarc, and Iceland (Fig 1, Table 1). These groupings allow us to explore large variations in fluid sources and physicochemical characteristics which are ultimately related to their tectonic setting. For example, in Costa Rica the distance to the trench (outer forearc to active arc to backarc) correlates with large variations in temperature, pH, and mantle-derived components that affect chemical compositions of the fluids and the microbial communities [3,4]. In total 13 sites have the lowest temperatures (19.2-29.9˚C) and eight sites have the highest temperatures (80-93.5˚C) ( Table 1). Seven sites have the lowest pH (0.85-3.21), while 11 sites have the highest pH (9.0-10.0).

Enzyme activities
Most of the hydrolysis rates measured were indistinguishable from zero (Fig 2). This could mean either that few of the enzymes were being expressed at the time of sampling or that we were unable to properly recreate the geochemical conditions present in situ. Of the enzymes tested, carbon-acquiring enzymes (AG, BG, CB, XYL) were more active than enzymes associated with phosphorus and sulfur acquisition (PHOS and SULF) (S3 Table in S1 File). LEU hydrolysis was orders of magnitude lower than the other enzymes assayed, in contrast to soils where LEU often has high hydrolysis rates [52]. NAG hydrolysis was positive at more sites than the other substrates. PHOS hydrolysis had only one site with zero activity, CL18S. From the Kruskal-Wallis statistical analysis, no significant differences (p>0.05) were observed for any enzyme activities between provinces.

Metagenomic assemblies
Of the three metagenomic assemblies from Iceland, HV221S has 4,344 contigs, KR21S has 3,214, and HV121S has 912 (S1 Table in S1 File). The total length of these assemblies is 5,645,364 bp for HV121S, 7,802,221 bp for HV221S, and 12,561,178 bp for KR22. The total contig numbers for the Costa Rica, Panama and Argentina assemblies range from 488 to 164,798 (S2 Table in S1 File). The total sizes of the Costa Rica, Panama and Argentina assemblies range from 8,678,655 to 509,373,734 (S2 Table in S1 File). Genus level taxonomy classification was done using gottcha2 on Kbase (S1 Fig in S1 File) [41,47]. At the genus level, HV121S has 70% of reads annotated as Sulfolobus, 15% as Acidianus, and the 8% as Thermoproteus.The remaining reads are distributed across Methylorubrum, Pseudomonas, and Stenotrophomonas. HV21S had 38% of reads annotated as Sulfolobus, 21% as Thermoproteus, 10% as Cutibacterium, and the rest are distributed across other genera. KR21S has 45% of reads identified as Acidianus, 8% as Sulfolobus, and 5% as Thermoproteus, with the remaining taxonomy distributed across other genera.

Peptidases
In total, there are 144 MEROPS family annotations predicted to be secreted (Fig 3), M (Metallo) with 59, S (Serine) with 32, C (Cysteine) with 29, A (Aspartic) with 9, N (Asparagine) with 5, T (Threonine) with 4, U (Unknown) with 4, G (Glutamic) with 1, and P (Mixed) with 1 (Fig 3). Of these 144 families, 88 are present in every assembly. The total read abundance of annotations for MEROPS is 283,488,080, which is much greater than those of CAZy at 1,729,668, and EC3 at 633,511, which are discussed below. Important MEROPS families are listed in Table 2.

CAZymes
CAZy families predicted to be secreted are present in all sites at orders of magnitude lower read abundance than peptidases. Of the six CAZy classes, the most abundant are the glycosyl hydrolases (GH). CAZy families that are abundant in all sites are GH23 (peptidoglycan lyase),   Within the spearman correlation of the site clustering, we see a grouping of sites that are from the Argentina backarc and have a higher temperature range (Fig 4). These sites cluster together due to the low abundance of enzyme annotations within their assemblies. Some sites' sediment and fluid samples cluster together such as, LB18, SC18, BJ19, QN17, AO19, RS17, QH217, ER18, and LC19.

CAZy degradation groupings
We summed the abundance of CAZy families that are either part of chitin/peptidoglycan degradation (bacterial necromass) or xylan degradation (plant products) (Tables 2 and 3). In total cell-degrading enzymes are in higher read abundance (521,794.37) than photosynthatedegrading enzymes (227,434.98) ( Table 3). The assemblies with the highest percentage of cell-  degrading enzymes are BJ18F, SI17F, KR21S, ES17F, RV17F, and CI18F. Only 20 assemblies have fewer than 50% of the cell-degrading enzyme annotations. The abundance of these enzymes for each site, with sediment and fluid components, is shown in a stacked bar plot to visualize the differing quantity across sediments and fluids (Fig 6). Overall, cell-degrading enzymes are more abundant than photosynthate-degrading enzymes. Photosynthate-degrading enzymes are more abundant in sediments than in fluids, whereas the cell-degrading enzymes are more abundant in fluids than in sediments (Table 3). CAZy family GH103 is more abundant in fluids than in sediments (Fig 6). CAZy families GH18 and GH3 are more abundant in sediments than fluids. https://doi.org/10.1371/journal.pone.0281277.g005 Table 3. Sums of enzymes associated with cell and photosynthate degradation. Cell enzymes are in the chitin and peptidoglycan groups of Table 3. The photosynthate enzymes are in the groups xylan and cellulose of Table 3. The total of these two groups are summed for each site.

Multivariate analysis
We performed multivariate analysis on the datasets using a principal component analysis (PCA) to perform an unconstrained coordination analysis. For cell-degrading enzymes, sites do not cluster based on province ( Fig 7A). Instead, we see that high temperature sites (HV121, HV221, KR21, GF119, GF219, BQ1, and BQ2) cluster together while all other sites are indistinguishable based on cell-degrading enzyme abundance. Photosynthate-degrading enzyme abundances, however, do cluster by provinces, with the Costa Rica active volcanic arc and Argentina active volcanic backarc tending to group together. Sites with volcanic activity correlate with CAZyme families GH1, GH5, GH9, GH51, and GH116, which are specifically related to cellulose degradation [58]. Here, high temperature sites continue to cluster together, but do not correlate with any specific enzymes. The nonvolcanic sites correlate with CAZy families GH16, GH43, GH74, GH11, and GH67. Families GH11, GH43, and GH67 are known to be directly related to xylan degradation [59].
To further support the idea that province separation is driven by photosynthate-degrading carbohydrate-active enzymes, rather than all peptidases, a PCA analysis was done on all the MEROPS, CAZy, and EC family annotations, as well as the annotations that correspond to the enzymes that were the target of the activity assays (S2-S5 Figs in S1 File). These PCA analyses show that there are no distinct clusters based on province. However, we see province clustering for a PCA plot of only cell-degrading and photosynthate-degrading CAZy enzymes (S6 Fig in S1 File).

Discussion
All 63 sites produced a high diversity of enzymes predicted to be capable of breaking down organic matter outside the cell. This suggests that hot spring communities can break down a wide variety of organic compounds ranging from proteins and carbohydrates to structural molecules, as has been previously suggested [17,[61][62][63]. Below we will describe how the distribution of organic carbon degrading enzymes across these sites suggest these communities are primarily supported by microbial biomass, rather than plant detritus, consistent with a chemolithoauotrophically-based ecosystem. But these heterotrophic enzymes are less directly influenced by geological features than the taxonomic compositions or respiratory capabilities of these communities [64][65][66][67].

Nature of the heterotrophic enzymes across all seeps and hot springs
We propose that the heterotrophic community in these terrestrial seeps and hot springs primarily use the organic matter of dead bacterial necromass produced in situ rather than allochthonous surface-derived material from plant matter. The heterotrophic community may include autotrophs that are also capable of metabolizing organic compounds. To consume complex organic matter, heterotrophs rely on extracellular enzymes such as CAZymes and peptidases to degrade the larger organic molecules so they can bring them into the cell [64]. CAZymes are important for geochemical cycling because they facilitate the breakdown of complex carbon substrates [65]. Another subset of enzymes involved in the biogeochemical cycling of heterotrophs are peptidases, which cleave peptide bonds between amino acids [60,66]. Protein degradation has  been shown to be important for heterotrophs within the subsurface [67,68]. Heterotrophs within these springs may rely on protein from dead cells (necromass) for carbon and nitrogen acquisition, as they do in marine sediments [67,69]. EC annotations are based on the reactions catalyzed which allows for analysis of different reactions within the assemblies, and much of the important heterotrophic enzymes fall under EC3, hydrolases [36]. The enzyme annotation methods do not distinguish between prokaryotic and eukaryotic sources [34,36,65]. Thus, the set of enzymes present in an environment can indicate the chemical nature, and therefore the source, of organic matter being consumed by heterotrophs [70]. An ecosystem supported by bacterial and archaeal cell necromass should contain more enzymes involved in peptidoglycan and chitin degradation such as chitinases, N-acetylglucosaminidase, and lysozymes [53][54][55][56]. Enzymes that are involved in the degradation of plant material, which in the case of subsurface springs would indicate allocthonous material from the surface, are often cellulases, xylanases, and glucosidases [57][58][59]. These enzymes have been characterized into CAZy families (Table 2). In our metagenomic assemblies, enzymes associated with cell necromass degradation are the most numerous overall, i.e., twice the amount of photosynthate-degrading enzyme families (Table 3). Enzymes for cell necromass consumption include those that break down peptidoglycan, a key component in bacterial cell walls. CAZyme families associated with peptidoglycan degradation are GH103, GH102, GH73, GH25, GH24, GH23, and CBM50 (Table 2) [56]. These families are found in most of the assemblies (Fig 4). The CAZyme families with the highest abundance are GH23 (peptidoglycan lyase) and CBM50 (LysM domain). These two CAZyme families are integral to the degradation of peptidoglycan.
Another facet of the heterotrophic community utilizing necromass for essential nutrients is the acquisition of starch, glycogen, and trehalose [71]. These compounds may also be derived from photosynthate materials. The CAZy families that are present within all assemblies are from the families GH13, GH15, GH31, GH57, GH122, and GH133 (Fig 4), which are involved in starch degradation [72]. This suggests that the possibility of using starches is a common heterotrophic metabolism across these sites.  The peptidase annotations also support the idea that the heterotrophic community relies on necromass for carbon and energy acquisition. Peptidase families such as S11, M23, S13, S66, M15, M74, M14, and C51 are used for peptidoglycan degradation (Fig 3). These families' biological functions are associated with lysis or degradation of bacterial cell walls [53]. Of the eight listed cell wall degrading/lysing peptidases, all except M74 and C51 are present in every assembly. Peptidase family M74 is present in all but four of the assemblies, GF119F, GF119S, GF219F, and GF219S. Peptidase family C51 is present in all but eight of the assemblies, BQ117F, GF119F, GF119S, GF219F, GF219S, HV121S, KR21S, and RF19S. All the samples lacking M74 and C51 fall within the temperature range (80-93.5˚C) in magmatic steam-heated springs. This may suggest that M74, which is a murein endopeptidase and C51, a D-alanyl-glycyl peptidase, are not adapted to high temperatures.
CAZyme families associated with plant degradation have lower total abundance in comparison to cell degradation even though there are more possibilities for plant degrading families to be found (Table 3). Of the 45 assemblies that have more than 30% plant degrading enzymes out of all the enzymes related to cell and plant degradation ( Table 2), 34 are sediment samples (Table 3). Therefore, sediment assemblies contain more annotations for photosynthatedegrading enzymes than fluid samples. This may be caused by additions of photosyntheticallyderived material to the sediments deposited at the surface.
The dominance of cellular biomass as the main organic matter source for the heterotrophic community is consistent with chemolithoautotrophic biomass forming the major primary production. This agrees with the findings from Barry et al., 2019a, who used helium and carbon isotopes from the same sites to show that the dissolved inorganic carbon was almost entirely derived from deep (i.e., mantle and subduction-related) sources [3]. Since hot spring enzymes related to cell necromass are more abundant in our metagenomes, we propose that hot springs microbes primarily subsist using necromass derived from chemoautotrophs rather than surficially-derived organic carbon ultimately derived from plants.

Independent confirmation of enzyme activity with substrate proxy analyses
Extracellular enzyme assays were used as a proxy for the activity of heterotrophic organisms within the seeps and hot spring derived sediment samples. Enzymes that are commonlyassayed in soils are often associated with cellulose and lignin degradation along with enzymes that hydrolyze proteins [70]. Degradation of plant litter is often demonstrated using assays for AG, BG, CB, and XYL, as they degrade cellulose and xylan. LEU assays represent carbon and nitrogen acquisition from proteins alongside NAG which is a proxy of the degradation of peptidoglycan and chitin [73]. Extracellular enzyme assays can reveal different organic carbon and nitrogen sources for microbial communities, allowing for inferences about biogeochemical cycling within the system. Enzymes involved in carbon and nitrogen acquisition have been used to demonstrate the limitations of nutrients within various systems [70,73,74]. The results for these assays predicted that these organisms are not highly active (S4 Table in S1 File). However, the fact that activity could be observed at all suggests that at least some of the enzymes we identified in our metagenomic annotations were active in natural samples and the lack of extracellular enzymatic activity may be the result of not recreating the ideal geochemical parameters for the hydrolysis within the lab setting [75].

Highest temperature magmatic steam-heated springs have low diversity of heterotrophic enzymes
Lower diversity microbial communities tend to be found at very high temperature springs [55]. Accordingly, our high temperature and low pH sites (GF219S, GF119F, GF219F, HV121S, and KR21S) have fewer annotations for both CAZymes and peptidases than more mesophilic springs. This is likely due to the small number of organisms present within these sites (S1 Fig in S1 File). Although these sites are driven by heat and volatiles originating from the subsurface, they are not representative of deep subsurface communities. The depth of the subsurface biosphere at these sites is shallow (<50m) due to the high heat flow in the area and near boiling temperature that limit the possible distribution of microorganisms at depth [11]. The shallow subsurface nature and lower residence time of the communities in these sites combined with the lack of phylogenetic diversity is the likely cause of the lower functional diversity of the heterotrophic community.
The extracellular enzymatic assays demonstrate that many high temperature sites may not be expressing the enzymes found within their metagenomes (S4 Table in S1 File). The sites with the highest extracellular activities are within the temperature range of 32.5-50˚C. Extracellular enzymes require energy to produce, so organisms that are already spending energy surviving in very high temperature and low pH sites may not have excess capacity for extracellular enzyme production [67,76,77]. Therefore, because the hyperthermic sites are energylimited they are not expressing as many extracellular enzymes as the more mesothermic springs.

Heterotrophic metabolism is not influenced by geological processes
Chemolithoautotrophic metabolisms and the redox couples that provide power for them vary depending on geological province across the Central America convergent margin [4,7]. We hypothesized that the heterotrophic community's enzymatic functions would also vary across the provinces, since their taxonomic identities and respiratory properties do [4,7]. However, most of the extracellular enzymes in our metagenomic assemblies do not correlate with geological provinces. Based on hierarchical clustering of spearman correlations, all sites except for the very high temperature magmatic steam-heated sites are difficult to distinguish based on abundance of predicted extracellular enzymes for peptide and carbohydrate degradation. Even though the populations of chemolithoautotrophs vary by geological setting, the biomass they produce may be similar enough that it can be broken down by similar sets of enzymes ( Fig  7A), even though the heterotrophic taxa that make these enzymes also vary by geological setting [4,7].
One group of extracellular enzymes, however, does differentiate by geological province. The PCA analysis of the photosynthate-degrading enzymes shows slight clustering of the Costa Rica active volcanic arc and backarc and the Argentina backarc ( Fig 7B). These sites have the same amount of photosynthate-degrading and cell-degrading enzymes as the other provinces, but the composition of the photosynthate-degrading enzymes more closely reflects cellulose degradation with enzymes such as GH5, GH9, GH116, and GH1. Springs and seeps that lack direct magmatic influence, such as the Costa Rica outer forearc, the Cordillera Talamanca, and Panama cluster separately from the volcanic sites, with more xylan-degrading enzymes. When all CAZy families for xylan, chitin, and peptidoglycan degradation are combined, clustering based on province still occurs, suggesting a strong association of volcanic sites with cellulose-degradation and non-volcanic sites with xylan-degradation ( S6 Fig in S1 File).
Since these photosynthate-degrading enzymes are more abundant in surficial sediments than in the freshly-expressed fluids, they are likely degrading surface-derived organic matter rather than chemolithoautotrophic production in the subsurface. The surface-derived substrates that are cleaved by photosynthate-degrading enzymes, however, are unlikely to come from plants, since our sites in the Costa Rica volcanic arc are in a dense jungle and our sites from the Argentina backarc that group with them are from the high altiplano desert, which is extremely dry with little vegetation. If the photosynthate-degrading enzymes were mostly driven by introduction of surrounding plants, then the Costa Rica volcanic zone should have more similar enzymes to the other Costa Rica and Panama sites, since they are close together and have similarly dense vegetation. A more likely potential source of xylan for these springs is thermophilic algae [78,79]. Volcanic springs are often associated with higher temperatures, average 55.4˚C, while the non-volcanic springs are less thermophilic with an average temperature of 37.1˚C. The lower temperatures of the non-volcanic springs are closer to the optimal temperatures of growth for diverse algae, some of which are also well adapted to sulfidic and acidic sites [80]. Therefore, the non-volcanic sites may allow for algae to grow and act as a source of xylan for the heterotrophic organisms. Cellulose, which is more common in volcanic sites, is often found in cyanobacteria [81], which have a higher temperature tolerance than eukaryotic algae [82]. However, we cannot rule out the alternate possibility that these enzymes are responding to delivery of different types of subsurface-derived organic matter in the volcanic vs. non-volcanic systems.

Conclusions
Here we present extracellular carbohydrate-and peptide-degrading enzyme potential from the metagenomes of 63 seeps and hot springs across the Central American and the Andean convergent margin (Argentinian backarc of the Central Volcanic Zone), and Iceland (mantle plume-influenced spreading center). Throughout the seven tectonic-geographic sample groups examined, we see that the heterotrophic community primarily relies on the degradation of proteins rather than carbohydrates. This is supported by the MEROPS annotations that are found in high abundance across all assemblies. The highest CAZyme and peptidase annotations are for families associated with peptidoglycan degradation. This supports the hypothesis that most of the metabolic function for heterotrophs is derived from the degradation of dead microbial cells, consistent with the major source of organic matter in this system being subsurface chemolithoautotrophic production. Very high temperature (>75˚C), low pH sites (< 4), that are heated by volcanic inputs, differ from the rest of the sites based on their CAZymes and peptidases. Except for a few thermophilic peptidases, they had fewer extracellular carbondegrading enzymes, suggesting that the secondary trophic level is less well-developed at these sites, possibly because they must put more energy into survival in these extreme conditions. Except for these high temperature sites, most extracellular CAZyme, hydrolase, and peptidase families did not differ by geological province. This suggests that, even though the taxonomic identities and respirations of the chemolithoautotrophs and heterotrophs vary by geological province [4,7], their organic matter degrading capabilities do not. The exceptions are the photosynthate-degrading enzymes which comprised a minor component of the carbon-degrading enzymes. Volcanic sites had more cyanobacteria-degrading enzymes while non-volcanic arc sites had more algae-degrading enzymes, likely due to the difference in temperature preference of those two types of phototrophs. This study revealed that the secondary community within terrestrial geothermal systems actively participates in the carbon budget within these sites by consuming chemolithoautotrophically-derived dead cell material, with enzymatic capabilities that are independent of geological province.